So yeah, a little bit about me. No one really cares about who the speaker is, so read this
whenever you want. I do podcasting and blogging and shit like that. I am a firm believer in
the wisest man is he that knows that he knows nothing. I am not an expert, and I'm quite
happy to admit that I know absolutely nothing. I kind of like edge case stuff. It's kind
of ‑‑ it's freaky. Most people are like, oh, it's just an edge case. No one really
cares. But the edge case stuff, the freaky stuff, the weird stuff kind of really interests
me. It makes me think that's really cool. I should really dive into that. So this is
part of the edge case stuff that people really don't give a shit about. But apparently all
you guys do. Otherwise you wouldn't have woken up early to come and see this.
So I'm going to start with a quick warning. This presentation, it contains numbers and
jokes and traces of peanuts.
Who has ever seen me give a talk before? So if you had, you wouldn't be here. So everyone
who has seen me give a talk before, I'm sorry, I use the same jokes every time. So just laugh
when everyone else does. Not you, Ed, you have to stay. So I'm going to give everyone
the TLDR and what I'm going to be talking about today. The goal of this talk is to describe
the defensive uses of
HTTP status codes. That sounds really sexy. Doesn't it?
Yeah, this is an absolute must see at DEF CON on a Sunday morning.
So back to why are you guys awake again? I'm going to run through the what, the why,
the how, the goals, and then we're going to bring it together and we're going to review
what we've covered. So I'm going to try and run through this reasonably fast so I can
get everything in.
So let's start with the what: HTTP status codes. So who has never seen an HTTP status
code?
Yeah.
Okay.
I thought so.
So we know what an HTTP request looks like.
We're picking out the specific section, this is the status code or the response code.
The terms are interchangeable depending on how much you had to drink.
It's like a really small little thing every time you make a request or every time you
get an answer from a server, it comes with some kind of status code.
No one really cares what they are, the browser doesn't really tell you what it is, but it's
an important feature of the HTTP standard.
What I'm going to show you is like a really small detail, but it's a really big impact.
If you don't really pay attention to the status codes, then some bad things can seriously
happen.
So a little bit of history on HTTP status codes, there's an RFC, I'm sure everyone
here has read that.
Right?
I couldn't sleep last night.
I thought maybe I should read the RFC again, but yeah, that didn't happen.
So there's five main classes of responses.
You get the 100, which is the informational stuff.
You don't get to see that very much.
You get the 200, which is most of the time success.
Your web page is here.
Here is the content.
Thank you very much.
Please go away.
You get the 300, which is the redirect stuff.
You get the 400, which means you fucked up.
You get the 500, which means they fucked up.
Simple as that.
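As a rough sketch of those five classes, the first digit of the code is all you need. The function name and the labels below are made up for illustration and are not from the talk.

```python
def status_class(code: int) -> str:
    """Map an HTTP status code to its class using the first digit."""
    classes = {
        1: "informational",   # you rarely see these
        2: "success",         # here is the content
        3: "redirection",     # go look somewhere else
        4: "client error",    # you messed up
        5: "server error",    # they messed up
    }
    # Anything outside 1xx-5xx (for example the joke 7xx range) is "unknown".
    return classes.get(code // 100, "unknown")
```

For example, `status_class(404)` gives `"client error"` and `status_class(503)` gives `"server error"`.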
There's also a wonderful RFC, and this one is actually worth reading, for the 700 codes,
by John Barton.
If you go to his GitHub page, there's an entire section.
I like the meh.
I am not a teapot.
I specifically like this one, fucking Unicode.
No.
Thank you.
Thank you.
There's like 300 of these.
So I have no idea how they squeezed it all into the 700 range, because there are 300
of them.
Yeah, these are amazing.
I have no idea where did the 600 range go?
So it's, yeah, I really hope they accept that RFC and start implementing that and some
stuff.
So let's go through the basic stuff.
This is the theory bit.
It's boring.
I'm sorry.
But this is the theory.
So you get the 100 informationals.
You get the 100 Continue, Switching Protocols, Processing.
As I said, these are the things you don't get to see very much.
Moving into the 200 stuff, it means it worked. It means it understood. So you're getting
a 200 okay, which is most of the web is returning a 200 okay. You also get some weird stuff
that you don't get to see very much, like 204 no content. Great, thanks for the header.
There's also some interesting stuff that isn't supported by Apache. But, yeah, low on storage
space. I've never seen that one returned by a server, but I'd really like it. But they're
not supported by Apache, so. Gesundheit. You get the 300 redirection stuff. So most
people know what a 301 is, what a 302 is. 300, you don't get to see very much. Multiple
choices. What is it? It's like an exam. They give you tick boxes or something. 304 not
modified as well is something you see quite a lot if you're looking at the way data flows
backwards and forwards. You also get some weird stuff that isn't used anymore like switch
proxy. Sounds like fun. Use proxy is also interesting. If you return a proxy setting
in the location header, it says, oh, you should use this proxy for your communications. I'm
sure no one would use that for malicious purposes in any way.
So moving on to the 400, which is the you fucked up section. So forbidden, unauthorized,
usual stuff. 404 not found being quite a reasonable response when I search for random crap on the
Internet.
Then you get some interesting stuff. I quite like the 407 proxy authentication required.
I'll talk a little bit more about that later on. There's some interesting ways to do some
malicious stuff with that. This is a long list. I quite like the 418. I'm a teapot.
It's only in an April Fools' RFC and quite a lot of web servers don't implement it, but this is
interesting. It's like I am indeed a teapot. Moving into the 500 stuff, you get a lot of
stuff. Internal server error, which is unfortunately used quite a lot for SQL injection. Oh, I got
a 500 error. That must be a SQL injection trigger. You don't get to see it very much
if you're not abusing websites. It's still ‑‑ it's interesting. So, wow, that's a lot of
fucking numbers.
All right. So everyone here knows every single HTTP response code now, right? Great.
I don't have to talk about them anymore. So why are we doing this? Okay. It started off
as a little idea.
I read a couple of books. I do that on occasion. I don't know if any of the authors are in
the room. If so, I'm really sorry. I'm plagiarizing your work. And I started to have a little
think. What can we do with that? We can do some interesting stuff with scanners. And
screwing with script kiddies is a hobby, should we say. It's more of a life calling.
You know, and that sounds like fun. That sounds like something I want to do at a weekend.
And I started to look around, and there's a guy that everyone follows on Twitter, right?
The Grugq, who's not here because he's drunk, or he should be. He said in one of his quotes,
stop dismissing obscurity as a security feature because unpredictability in your defense works
to your advantage. Well, that's great. We can increase attacker costs by being unpredictable,
and we can waste attacker time. We can make an attacker waste three hours attacking a website
that should take 15 minutes. Why should we just
stand there and let someone smash us in the face? We should be more active on our defense.
So there's some prior art. I kind of looked around. I was trying to find who else has
talked about this stuff before because in my mind this was just obvious stuff. I was
like someone is bound to have already implemented this stuff. There was a 2004 talk by Haroon
Meer and the guys at SensePost where they mentioned using HTTP status codes to slow
down attackers. It was a one‑line comment in a slide deck. That was it. Okay. Well,
someone has mentioned it, so there must be more.
There was an interesting paper by Gunter Ollmann, and I've probably totally mispronounced his
name there. There's a PDF where he talks about stopping automated attack tools. I was like,
okay, well, this must cover it. He covers lots and lots of stuff, but he doesn't really
dig deep.
So I carried on, and I was informed, shall we say, of a mailing list comment where Ryan
Barnett said that, well, maybe we can reply with a 503 with a Retry-After header. That
was interesting. I tested it out. It didn't really seem to work because most scanners
just completely ignore a Retry-After header. But maybe things have moved on. This was 2006
when that comment was made. So, yeah, so no one seems to have discussed
this stuff. So how are we going to do ‑‑ how are we going to work this in?
So browsers have to be as flexible as possible. You get websites written by kiddies in notepad,
and you get professional websites written in HTML5 with web APIs and SOAP interfaces,
and the browser has to be able to support it all, okay? And this leads to a certain
amount of flexibility on how things are understood, how things are supported. And, of course,
there's RFCs, which some would say is the dark side.
They're more of a guideline, really. This is the way you should do it, but we're not
going to tell you exactly how, so it depends how drunk you were when you read it. Maybe
this makes sense. So, yeah, what could possibly go wrong? You have a 300‑page RFC and people
who are going to interpret it, and you implement it into a piece of software which has to be
as flexible as possible. Otherwise things don't work.
Things are going to start to go wrong. So I started to do a little bit of testing.
I wanted to restrict myself to the big three, Internet Explorer, Chrome, or Chromium and
Firefox. No.
Apparently Opera turned really bad, or there's Lynx, but who uses Lynx?
Okay, it's like one guy uses Lynx. Welcome to the 20th century.
And I wanted to take the easy option on testing, so I thought, okay, well, there's
mitmproxy. I don't know if anyone has used it. It's a Python‑based system
for man-in-the-middling HTTP connections. It allows you to set up this really interesting
reverse proxy. It's all script‑based, and it does all the work for you. You just write
something like this in a script file. That's it. All this does is change every response
code to a 200. Okay? That's easy. Even I can code that shit. Unfortunately, this is
a proxy that uses up all the memory you have on your machine, even if you have 8 gigabytes
of memory in your laptop. I would highly recommend mitmdump, which doesn't cache
everything into memory. So, yeah, just a side note there. I also used PHP, the amazing
personal home page. It allows you to set response codes, but there's some downsides. You can
also prepend a file, which is going to allow you to set specific response codes or do some
logic in the background. The problem is if the web server says, no, this is an incorrect
request, the PHP is never going to get to actually see the request, so you can't set
response codes. So you can do some interesting stuff. For testing it's useful, but in production
it's not really going to be as useful as it could be. So I used a mishmash of Python
and PHP for most of the testing. You don't need to write this down because I'll release
all the slides afterwards. But simple testing of browsers, how things
are supported, you just call a PHP page which sets the response code, you tell it what code
in the URL, and you hope no one cross-site scripts your website. As simple as that. You
get a response, you get the requested response code and the actual response code, because
with PHP you can set that response code to 999 if you want, but Apache is just going
to turn around and say, I don't know what that
response code is. And then it just returns a 500, which is the "they fucked up" section.
So, yeah, you get to see what you requested, you get
to see what the response was, and you get to see what the headers are. So simple JavaScript
running in the browser. Okay, so great. So I can run off and start testing these browsers.
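A test page like the one described can be sketched in a few lines. This version uses Python's standard `http.server` instead of the PHP page from the talk, and the `code` query parameter, `requested_code`, `StatusTestHandler` and `make_server` are all names invented here. Treat it as a test-bench sketch under those assumptions, not the code that was actually used.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def requested_code(path: str, default: int = 200) -> int:
    """Pull ?code=NNN out of the request path, falling back to a default."""
    values = parse_qs(urlparse(path).query).get("code", [])
    # Only accept exactly three digits, so nothing attacker-supplied
    # is ever echoed back into the page.
    if values and re.fullmatch(r"[1-9][0-9]{2}", values[0]):
        return int(values[0])
    return default


class StatusTestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        code = requested_code(self.path)
        body = f"<html><body>requested status: {code}</body></html>".encode()
        self.send_response(code)  # status line carries the requested code
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), StatusTestHandler)
```

Calling `make_server(8080).serve_forever()` and then browsing to `/?code=418` would show how a given browser renders content delivered with that status.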
Which seemed like an easy thing. I started to think, I've got all this data on how all
these browsers work, and how can I graphically display that in a nice fashion? Let's just
say I'm not good at charting. Sorry for the women in the room. Also, I'm trying to keep
this even across the sexes. So I didn't really know how to display this. So I spoke to some
guys who do visualization and they were like, oh, yeah, we can do this and we can do that,
and it was just all shit. So what I ended up with was a table.
Yeah, this is the reason why I cut it down to three browsers because otherwise the table
would be like this fucking wide. So, yeah, there's a full table online which I'll release
after the talk which has a lot of other browsers and scripting languages in it. But this is
the core three. So you start to see quite a lot of conformity in this section about
how the browsers respond to things. So I broke it into three sections. So can you load HTML
with a 100 response code? Well, you can, but nothing appears. It doesn't render anything.
So the browser just doesn't support it. Unless it's an iFrame with Chrome, in which case
it tries to download it. Because I guess Chrome just likes to download shit.
What's kind of interesting, if you have Chrome on an Android phone and you respond with a
100 code, it tries to download it but it never finishes. So you have to restart your Android
to get it to stop, which is fun.
I'm sure no one would fuck with that.
So looking at this, you're like, okay, well, all the browsers are kind of mostly conformed,
except IE, obviously, which doesn't care about a 205 and just renders the stuff. It doesn't
really care. You start to see some differences when you're looking at the 300 codes. For
example, Firefox just doesn't load JavaScript if you respond with a 300 or a 301. IE pretty
much just ignores everything.
With anything in the 300 range, which isn't a redirection. But Chrome pretty much accepts
everything because it just doesn't give a shit. It's the honey badger of browsers.
Moving kind of into the 400 stuff, you start to see more conformity. Again, IE being the
weird one. It just doesn't like loading iFrames with weird response codes. And you see this
entry where it says 407, proxy. Basically what that means is that Chrome,
depending if you have a proxy set or not, will load it. So if you have a proxy, an HTTP
proxy set, then Chrome will load the content when it's responded with a 407. If you don't,
then it won't.
Again with the 500s, things are pretty standard. Again, IE being kind of a little bit of a
weird outlier on things. Okay. So think about this for a second. You've got browsers that
kind of handle things in slightly different ways. What can we do with that kind of stuff?
But, you know, I'm going to jump in here for a second.
The majority of stuff is like, okay, that's just content.
I don't care what it is. It's loading stuff. It's like you get a 400 response and it's
like, oh, I see HTML. I'm just going to render that for you. Or I see JavaScript. I'm just
going to run that JavaScript for you. So mostly, things are loaded quite
normally. But there's some weird kind of outliers. So with HTML responses, almost all response
codes are rendered correctly. It doesn't care. When you try and load an iFrame and it comes
back, there's some special cases for IE because IE is special. But most of the time kind
of things are even. And if you start looking at JavaScript, there's very limited support
for it. Okay. Chrome being the exception because they just don't care.
Okay. So we know what browsers interpret differently. So what do all the browsers have in common?
What are they doing the same across the board? The 100 codes, retries, confusion, as I said,
fun on Android.
And it times out eventually. Okay. Because the browser thinks there's more coming. The
browser thinks, oh, you're going to send more data in a minute. I'll just kind of sit here
and wait for you. Which is kind of interesting. The 200 codes, again, you get the no content
or not modified. You just get headers saying, no, there's nothing here. So as you would
expect, all the browsers just ignore any kind of content that you're responding because
it doesn't expect there to actually be any content within those things.
So what about headers? Okay. So RFCs quite a lot of the time in kind of a muddy language
say if you're responding with a 3 XX response code, whether it's 301 or 302 or 303, there
should be a location header. Okay. That doesn't mean it has to. If you respond and you don't
have a location header, it just kind of ignores the fact that it's meant to do a redirect
and then renders whatever content you give it. Specifically no Location header, no redirect.
This makes sense. Because you're responding with a 302, it's looking for the Location header, it doesn't
find one, and instead of giving you an unhelpful error in your browser, it's going to render
what you returned and ignore the redirect. Simple as that. So the 401 unauthorized as well, if you're
not sending back a WWW-Authenticate header, it's not going to prompt you. Simple as that.
And I mentioned previously about the 407 proxy authentication required, the way Chrome deals
with it, if you're not requesting the authentication, then it's never going to prompt you.
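To make the missing-header cases concrete, here is a small sketch of what such responses look like on the wire. `build_response` is a helper made up for this example; it just glues the pieces together and deliberately does no sanity checking, which is the whole point.

```python
def build_response(status: int, reason: str, headers: dict, body: bytes) -> bytes:
    """Assemble a raw HTTP/1.1 response exactly as given, with no sanity checks."""
    all_headers = {**headers, "Content-Length": str(len(body))}
    lines = [f"HTTP/1.1 {status} {reason}"]
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


page = b"<html><body>still here</body></html>"

# A 302 with no Location header: there is nowhere to redirect to,
# so per the behavior described above the browser renders the body.
redirect_without_target = build_response(
    302, "Found", {"Content-Type": "text/html"}, page
)

# A 401 with no WWW-Authenticate header: no challenge, so no login prompt.
auth_without_challenge = build_response(
    401, "Unauthorized", {"Content-Type": "text/html"}, page
)
```

Whether a particular browser renders the body in each case is exactly what the testing above was measuring, so the comments reflect the talk's findings rather than a guarantee.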
On the flip side, just because an RFC says a specific status code shouldn't have a header,
it doesn't mean it can't have a header. So if you read the RFCs, there's 300 multiple
choices. Now, this shouldn't have a location header. It doesn't redirect you. It should
come up with an HTML where you can specifically select where you would like to go. Unless,
of course, you're Firefox or IE, in which case if you give it a location header, it's
just going to redirect. But Chrome isn't. Okay? And there are so many headers out there
you can play with. I've played with quite a lot of them. Most of them are not particularly
interesting, unfortunately. But there's more work to be done in that kind of area. There's
a load of headers like the retry after header that can really be played with and a little
bit more research is required there. So each browser is handling something a little
bit differently. We know how things are handled differently. We know how things are handled
the same. I wonder what we can do with
that, you know? So what can we do with that? What are the
goals? Okay. So each browser handles things differently. You have the handled codes. You
have the unhandled codes. And then you get this little browser weirdness thing where it
just does random stuff you just didn't expect it to do, depending on the headers.
So browser fingerprinting. Okay. So, yeah, you can check user agent strings, but that's
client side controlled.
You can easily spoof that stuff. But if you take the differences, you can really do some
interesting fingerprinting work on Firefox, on Chrome, and on IE. So on Firefox, based
on the information we have, it doesn't load JavaScript returned with a 300. Okay? The
other browsers do. So you simply return a JavaScript with a 300, and if the JavaScript
doesn't run, it's Firefox. Simple as that. Okay? With Chrome as well, it loads JavaScript returned
with a 307, temporary redirect, without a Location header. Okay? So that's a little bit
more complicated.
So we can do the same kind of thing, and we can do some fingerprinting. And with IE, it
loads JavaScript returned with a reset content status code. And nothing else does. So if
we can add all that stuff together, you can get a nice way of fingerprinting the main
browsers without even bothering to look at the User-Agent header. Or you can use the user
agent header, and then specifically say, well, I'm going to check that by seeing how your
browser really works. Okay?
So I'm going to do a quick demo. Well, actually, I lied. I'm going to run a quick video of
a demo, because I didn't want to connect to the network here.
First talk. Yeah, so all this is doing is loading a PHP page, and then running through
and loading three individual pages. It then checks the responses, checks the JavaScript,
and says, okay, so this JavaScript ran. This JavaScript didn't. So you must be using this
specific browser. Okay? So if you zoom in, well, if I zoom in, there you go. So you can
see the specific responses. You can see that it's loading an HTML. It's coming back with
a 300, a 307, and a 205, which are the three specific response codes we talked about for
the different browsers. It's then sending all of that stuff to a PHP page on the server.
And the PHP page is returning and saying, okay, this is the specific browser. So obviously,
I'm returning back to the browser.
I don't want to display a popup to say this is the browser. In most cases, you're responding,
you're sending it to the server, and then you're never responding back, because I know
I'm running IE. I have the little IE bar at the top. I don't need you to do a popup
and tell me. But the server now knows exactly what kind of browser you're using. So if you're
spoofing a user agent string, then I can say, okay, so user agent string says you're
running Firefox, but you're actually running Chrome. That's suspicious. And that's something
that we should be looking at.
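The server-side decision could be sketched roughly like this. The signature table is pieced together from the behaviors stated in the talk (script served with a 300, with a 307 and no Location header, and with a 205) and only applies to the browser versions tested at the time. The function names and the crude User-Agent matching are assumptions made for this example.

```python
# Key: (script ran with 300, script ran with 307 and no Location, script ran with 205)
SIGNATURES = {
    (False, False, False): "firefox",  # also matches a client with scripting disabled
    (True, True, False): "chrome",
    (True, False, True): "ie",
}


def infer_browser(ran_300: bool, ran_307: bool, ran_205: bool) -> str:
    """Guess the browser from which probe scripts actually executed."""
    return SIGNATURES.get((ran_300, ran_307, ran_205), "unknown")


def claimed_browser(user_agent: str) -> str:
    """Very rough User-Agent parsing, good enough for a mismatch check."""
    if "Firefox" in user_agent:
        return "firefox"
    if "Chrome" in user_agent:
        return "chrome"
    if "MSIE" in user_agent or "Trident" in user_agent:
        return "ie"
    return "unknown"


def is_suspicious(user_agent: str, ran_300: bool, ran_307: bool, ran_205: bool) -> bool:
    """Flag clients whose behavior contradicts what their User-Agent claims."""
    inferred = infer_browser(ran_300, ran_307, ran_205)
    claimed = claimed_browser(user_agent)
    return "unknown" not in (inferred, claimed) and inferred != claimed
```

A client that sends a Firefox User-Agent string but runs the 300 and 307 probes like Chrome would be flagged, which is the "that's suspicious" case described above.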
Okay. So there's various other options for fingerprinting. The specific option I selected
was kind of the easiest option. There's various other stuff. There's a 300 redirect, there's
a 400 iframe on an Internet Explorer. If you want to look at the proof of concept,
if you go to C22.cc, proof of concept, fingerprint.html, it will run the same example that I just ran.
You can look at the traffic and the code is available. I'll link to that at the end.
So user agents can be spoofed. Okay.
Everyone knows that. Even script kiddies, unfortunately, know that. But browser traits
are really hard to fake. Okay. Because your browser responds and does things in specific ways.
Okay. So we've done that. We can fingerprint browsers. So what else can we do? Proxy detection.
We specifically talked about the way Chrome handles things. And if you have an HTTP proxy
set, you can specifically say, okay, I know they're using Chrome. I'm going to respond
with a 407. If it loads the page, then they're using an HTTP proxy. Okay. There's, unfortunately,
limitations here.
It doesn't pick up SOCKS proxies. But your average user isn't using a SOCKS proxy anyway.
So it's a limited interest, but it's something that needs further testing. So, again, as
I said, all you do is respond with a 407 with a Proxy-Authenticate header ‑‑ or
without a Proxy-Authenticate header. And if Chrome loads the page, then an HTTP proxy is set.
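As a sketch, the check boils down to one decision. It assumes you already know the client is Chrome (for example from the fingerprinting above) and that the page served with the 407 reports back if it was rendered, say through a script beacon. Both the function name and that reporting mechanism are assumptions made for this example.

```python
def chrome_proxy_verdict(is_chrome: bool, rendered_407_body: bool) -> str:
    """Interpret the result of serving a page with a 407 status code."""
    if not is_chrome:
        # The trick relies on Chrome's behavior; other browsers tell us nothing.
        return "inconclusive"
    if rendered_407_body:
        return "http proxy configured"
    # Not rendering only rules out an HTTP proxy; a SOCKS proxy would not show up.
    return "no http proxy detected"
```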
So while I was doing this research, I decided I was going to try a couple of different HTTP
proxies.
One of the proxies I selected was Privoxy. It's quite a popular privacy-enhancing HTTP proxy.
So I found while I was testing this, if you go to my web server and it
responds with a 407 proxy authentication required, you get
the pop‑up in your browser. But it doesn't say that my web server asked you for a username
and password. It says Privoxy asked you for a username and password, which is interesting.
Why is my local proxy asking
for a username and password, I thought. So I typed in test, test, and I clicked send,
and my web server gets my username and password. That's kind of interesting. I'm sure we can
use that for some malicious stuff. But this is a defensive talk, so I'm not going to dive
too much into that. But it's kind of interesting. Let's just say proxies aren't always configured
quite so securely as they could be. There's a fix for that now, so you can download the
latest version. But it's not just Privoxy. Any transparent
proxies like Burp or ZAP, they're specifically designed to just pass everything you give
them. So if my web server responds with a 407 and asks for authentication, Burp Suite will
just pass it to your browser. Simple as that. So you can start screwing with people who
you know are doing malicious things on your site with intercepting proxies. Of course,
if you're doing a test with Burp Suite and it pops up and says what's your username
and password, I'm probably not going to type in my real username and password. But you
never know. Script kiddies type in weird stuff, so.
So let's bring all this stuff together. So we've got status codes that are treated
like content. They don't care. You've got status codes that really are not specifically
well handled, specifically the 100 codes. And you've got lots of little browser quirks
that we can abuse. So what can we do? We can play with things, just in case there's
children in the room. We can make the people who like RFCs cry into their beer. So let's
try to use what we've described here. Let's
break some spidering tools. Cause some false positives, false negatives. Slow down attackers,
which is probably one of the most important things that we can do. Give us time to respond
to how people are attacking us. And then block successful exploitation. So even if
they do manage to exploit the server, if you're responding specifically with different codes,
maybe their exploit is not going to work.
So let's talk a little bit about spiders. This is a very simplistic and naive view of
this. But you access the target URL, you read the links, you test them. If true, continue.
So if you get a 200, okay, that's true. Right? And if you get a 404, that's false. Right?
But what happens if everything is a 200? And what happens if everything is a 404?
You start to get some weird stuff. And what happens if everything is a 500? Sometimes
if everything is a 500, then everything is a SQL injection attack. So if everything's
a 200, you end up with this interesting loop of, oh, I found another directory. I will
just keep scanning and keep scanning and keep scanning. You get this neverending spider.
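The naive spider logic above, and what happens to it, can be simulated without touching a network. Everything here is a toy: `crawl` is a made-up spider, the three "sites" are plain functions that return a status code and a list of links, and the request cap only exists so the example terminates.

```python
from collections import deque


def crawl(fetch, start="/", max_requests=1000):
    """Naive spider: request a URL, and if it is a 200, follow its links."""
    seen, queue, requests = {start}, deque([start]), 0
    while queue and requests < max_requests:
        url = queue.popleft()
        requests += 1
        status, links = fetch(url)
        if status != 200:  # anything but 200 is treated as "not there"
            continue
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    # Second value is True when the spider gave up with work still queued.
    return requests, bool(queue)


def honest_site(url):
    pages = {"/": ["/a", "/b"], "/a": [], "/b": ["/a"]}
    return (200, pages[url]) if url in pages else (404, [])


def everything_is_200(url):
    # Every URL "exists" and always points one directory deeper.
    return 200, [url.rstrip("/") + "/admin/"]


def everything_is_404(url):
    return 404, []


print(crawl(honest_site))                         # (3, False): finishes normally
print(crawl(everything_is_200, max_requests=50))  # (50, True): never runs out of links
print(crawl(everything_is_404))                   # (1, False): "what website?"
```

Real scanners are smarter than this, with depth limits and duplicate detection, so the sketch only shows why leaning on the status code alone is fragile.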
Unfortunately I couldn't find a picture of a spider eating itself. So if anyone has one,
please send it to me. And if everything is a 404, what website? I don't know if you can
see this at the back. This is the wonderful Acunetix tool, the script kiddies' tool of choice.
Stages, zero files, and validated zero findings. Yeah. So what website? There is no website
there. Skipfish loves it. It just keeps going and going and going until it kills itself.
Because I'm guessing at this point my test server just decided it didn't have enough
memory to deal with all the responses. If you look closely, there's also 2,000 low,
2,000 medium false findings on the scan alone.
Playing with people's spiders is kind of interesting. So false positives, false negatives.
So we can start to hide how bad or how good our servers are, start to really screw with
people and waste their time. So most scanners use response codes in some way. They kind
of have to. You speed up detection. You can't use regex for everything. It's the easy solution.
So if we start to respond,
with, again, 200 okay, 404, 500, and if we start to play with them and respond with random
codes, random being a selection of codes that are handled well by all normal browsers so
the normal people browsing the website are not going to be affected in any way, you start
to see some really interesting stuff. So it's a quick baseline using W3AF. I didn't particularly
pick on these people. It just happens to be an interesting baseline. So a standard baseline
with 39 informational points, 65 vulnerabilities, no shells, which kind of made me a sad panda,
but no shells. And it took an hour and 37 minutes to do a scan. So if everything responds
with a 200 okay, you still get everything. So we're not winning on the false positives
and false negatives. It takes you nine hours to run the scan, which is kind of interesting.
It lets us win time. But it's not really working. So 200 okay isn't really going to do what
we need it to do.
So if everything is a 404, it's a lot quicker to do the scan because you're not finding
everything. But it's missing a majority of the information points and a majority of the
vulnerabilities. So you're starting to see some interesting stuff. If we start responding
with weird codes, they don't find everything. Okay. That's interesting. If you respond to
everything with a 500, wow. False positives. If it's a 500, like I said, it's SQL injection.
Okay. 9,000 informational points.
Try digging through that report, you know. 9,000 confirmed vulnerabilities. I could
just see that pen test report. Sorry. That vulnerability analysis. That's going to be
about 1,000 pages long. It's like, oh, no, we found all this stuff. We have IIS vulnerabilities
and this and that. It's an Apache server, but we still have all of these IIS vulnerabilities.
So is there any people who use Nessus in the room? No. Okay. So if we start using random
status codes.
Okay. So I did multiple runs of this to make sure that I'm
getting accurate results. Because if it's being random, maybe I just had a bad run. It averaged out
as giving you a reasonable amount of false positives. And it took less time to run the
scan. What I found interesting on this was the majority of the things that it did find.
It didn't find the real vulnerabilities. It just found weird stuff. So even though
it found more vulnerabilities and you get lots of false positives, they're pretty much
all false positives.
So the real stuff just doesn't get found at all.
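The random-status-code idea might look something like this. The list of codes is an assumption: the talk only says they were codes that all normal browsers render fine, without naming them. Leaving real redirects alone is also a choice made for this sketch, so that normal navigation keeps working.

```python
import random

# Hypothetical pool of codes that browsers still render as ordinary content.
BROWSER_SAFE_CODES = (200, 400, 404, 500, 503)


def disguise_status(real_status: int, rng=random) -> int:
    """Swap the real status code for a random browser-safe one."""
    if 300 <= real_status < 400:
        # Assumption: keep genuine redirects intact so the site still works.
        return real_status
    return rng.choice(BROWSER_SAFE_CODES)
```

A person in a browser sees the same page either way, while a scanner that trusts the status code gets a different answer every time it asks.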
So Skipfish and random, wow.
It doesn't like random.
Let's just say Skipfish is particularly picky about it.
So the first scan time took ten hours and the second scan time took four seconds.
Yeah, and then again 16 hours.
I ran it about five times and it would just randomly flick between times.
So Skipfish is a wonderful denial of service tool for web applications.
I think in my proxy it sent like 33,000 requests inside five, six minutes.
It was just ‑‑ yeah, it will pretty much take down everything.
So we're not really slowing attackers down.
So what can we do to slow the attackers down?
What's our WAF really doing at the moment?
A standard WAF, again, a naive view, oh, my God, I'm being attacked, block or return
an error.
Whether or not that's a 403, a 500 or a 404 or a 200 with a nice message telling them
to piss off, profit, there's no profit there.
For us as defenders, with my defender hat on, we've won nothing.
All we've done is blocked an IP address or blocked an attack.
They come back with an obfuscation, they bypass it, game over.
So why are we doing that?
Remember this big list of status codes that browsers don't handle very well?
Specifically the 100 stuff?
Scanners don't like them either.
Surprising there.
A scanner thinks it's going to be a browser.
It's trying to do everything that a browser does.
So looking at the 100 codes, we can start to really screw with stuff.
So does anyone remember the LaBrea tar pit?
So that should be Tom Liston, not Tim Liston.
Apologies, Tim.
So it was originally designed to slow the spread of Code Red.
And it slows down attackers.
I mean, this was a great idea.
Did we forget that this was a great idea?
Did we think that, okay, well, it's been done now, we can just forget about it.
In our drive to kind of find new and interesting research, it's been done once, so we should
just ignore it for the rest of time.
So I had this interesting idea.
How about an HTTP tar pit?
People have probably talked about this a thousand times before, but, you know, it was interesting
to me.
Whoa.
This is the problem when you're on PowerPoint.
Drink.
Drink.
Drink.
Oh.
Morning, DEF CON.
It's Sunday morning.
You know what that means?
Drink.
Drink.
That's right.
So we're out of time.
Let's give a round of applause for our first time speaker.
How's he doing so far?
Doing okay?
All right.
Hook us up.
Hook us up.
I am not drinking all of those.
All right.
Who's first time at DEF CON?
First hand up.
Right here.
Come on up on stage.
If it's Sunday morning, that means this is the hardcore, right, right here.
You guys all got up.
Good job.
All right.
Congratulations.
Cheers.
Cheers.
Cheers.
Cheers.
For all of you.
That reminds me of last night.
So where was I?
Oh, yeah, I was in the tar pit.
So simple scenario.
So the WAF detects the scan.
Again, we're at the oh my God attack section.
Adds the IP address to the naughty list.
And then it starts to rewrite all the responses.
So you get the usual 101, 102 status codes. We just randomly rotate between them depending on how bored
we are at the time. We can also use 204, 304 which could be useful but it's not nearly
as fun as the 100 status codes. So let's do some experimentation, shall we? There's no
actual science included in this. So Nikto, wonderful tool. I like it a lot. I especially
like the logo. So the baseline scan, 2 minutes 18 seconds to find 18 findings. Simple as that.
So with the tar pit, yeah, we're winning some time there. Let's just say that. It's like
a 340 fold increase in time. But it's still finding quite a lot of stuff. This is mostly
informational stuff like you have an Apache whatever version as your server.
And even if you respond with 100 code, it's still going to get that header. So most of
that stuff is kind of uninteresting. Some of the findings, they disappear. And the script
kiddie spends 14 hours scanning your web server. So we're kind of winning.
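As a rough illustration of the tar pit scenario above (flag the scanner's IP, then rewrite every response it gets), here is a minimal Python sketch. The class and method names are hypothetical and are not taken from the speaker's code, and the step where the WAF actually detects the scan is assumed to happen elsewhere.

```python
import random

# Hypothetical names throughout; this is a sketch, not the speaker's implementation.
TARPIT_CODES = (101, 102)  # the 1xx codes named in the talk


class Tarpit:
    def __init__(self, rng=None):
        self.naughty = set()               # IPs the WAF has flagged as scanning
        self.rng = rng or random.Random()  # injectable so behaviour can be tested

    def flag(self, ip):
        """Called once the WAF decides this client is scanning."""
        self.naughty.add(ip)

    def rewrite(self, ip, status):
        """Return the status code that should be sent back to this client."""
        if ip not in self.naughty:
            return status                  # normal visitors are left alone
        return self.rng.choice(TARPIT_CODES)
```

Only flagged clients see rewritten codes, which matters because, as noted earlier in the talk, browsers handle the 1xx range badly too.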
So W3AF, again, same baseline as before, 1 hour, 37 minutes, 65 findings. But wait
a minute. This is going in the wrong direction. It's 18 minutes instead of 1 hour, 37. It's
weird. That shouldn't be happening. But it didn't find anything.
I'm guessing there was some kind of algorithm there that said I'm just going to stop bothering
to scan your web server now because it's kind of weird and I don't know what the fuck's
going on.
So yeah. Back to the denial of service tool, Skipfish. 18 minutes, 10 seconds to
find around two and a half thousand lows, mediums and a couple of highs. Which unfortunately
even on the baseline were mostly false positives. But, you know, whatever. Each to their own.
So five seconds.
Again, we're going in the wrong direction, okay? But there was no lows and there was
no mediums and there was only three highs. What I thought was interesting was the three
highs that it found were not any of the 12 highs that it found previously.
So not only false positives but different false positives to the normal scan. So, okay,
well, it's kind of doing weird stuff. And we like weird stuff because we're mucking
around with automated scanners and we're screwing with script kiddies. So random is good. So Acunetix,
as I said before, the script kiddie tool of choice. So you get a one hour scan. It's quite
reasonable. With a huge amount of informational stuff that you don't care about. And again,
we're going in the wrong direction. And this HTTP tar pit should be slowing stuff down.
But it's making stuff faster depending on what scanner you're using. But again, okay,
that's an interesting ratio of complete false negatives. So it's not finding stuff. Some
of these scanners are just like, okay, now this web server is just, you know, like, they're
going to stop bothering to scan it at all. What's interesting, it doesn't tell you that
it's just going to give up. It just says, I'm finished. It just says, no, I'm done.
So what do we find? So you can slow down some scanners, things like Nikto. Others give up
quicker because they just get tired of getting responses from the server or they time out
and say the server is not there anymore. If you look through a log file, I'm sure somewhere
along the line it says the web server stopped responding even though it didn't. But you
get a lot of unreliable and aborted scans. So up to 100% less findings. That's a win
for us. So let's move on to blocking successful exploitation. So even if someone can get past
all of this, they can find a high criticality in your web server, so we've made it hard
for them to find them, but people are going to find these vulnerabilities no matter what.
We've made it so that possibly it's going to take them 15,000 times longer to find the
vulnerabilities, but they're going to find them.
So let's stop them from popping shells with Metasploit.
Now how often does Metasploit reference status codes? So anyone care to guess? No? It's
about 1,000, give or take. This is not scientifically sound and it depends very much on how people
are wording things and how they're using their variables, but this is a simple graph through
searching for res.code, resp.code and response.code. So these are the variables that we're looking
at. These are the usual things that people use, and there's lots of dependency on status
codes. Unfortunately, even the stuff I wrote uses status codes. So I know it's bad programming,
but it's quick and it's what we all do because we use status codes to check the response
from servers. So here's an example of a simple snippet of code from one of the Metasploit
checks. So all it's doing is it's checking if the response code is less than a 200 or
more than a 300.
Okay. So I can return a 500. That's great. I mean, I can return a 500 with the content.
And then it's failing. Okay. So if it's not anywhere in the 200 range, which is the okay,
then the exploit just fails. Simple as that. Okay. Great. So if we're spoofing 404 but
still giving you the content, then this exploit is always just going to fail. And if you're
good enough to go in and edit the code and change things and you really know what's going
on, then you're not really the target of this talk.
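To make that failure mode concrete, here is a small Python sketch of a check that trusts the status code before it looks at anything else. It is not Metasploit code; the function name and the marker string are made up for illustration, and the range test is an approximation of the "less than 200 or more than 300" condition described above.

```python
def naive_check(status, body):
    """Hypothetical exploit check that gives up unless the status is 2xx."""
    if status < 200 or status >= 300:
        return False                      # bail out without reading the body
    return "vulnerable-marker" in body    # made-up fingerprint string


page = "<html>vulnerable-marker</html>"
honest = naive_check(200, page)    # server tells the truth
spoofed = naive_check(404, page)   # same content, spoofed 404
```

With the same body in both calls, the honest 200 passes the check and the spoofed 404 fails it, so the tool reports the target as not exploitable.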
Okay. We're targeting script kiddies who know absolutely nothing. All they know how
to do is run the code, and if it doesn't work, they just cry in a corner. Interesting side
effect. If it is a 401, it just starts to print out the response headers like the WWW-Authenticate
or the authentication header. Of course, as we mentioned before, we don't
need to send those headers. So what happens if you don't send those headers? You start
to get errors within Metasploit because it's trying to print out stuff that it shouldn't
really ‑‑ it should be there.
But it's actually a nil value because we haven't actually set it at all because
we haven't provided it. So that's an interesting side effect.
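The same side effect is easy to reproduce outside Metasploit. The Python sketch below is a hypothetical stand-in, not the module's real code: it assumes a tool that prints the WWW-Authenticate header whenever it sees a 401, without checking that the header was sent.

```python
def report_auth(status, headers):
    """Hypothetical reporting helper that assumes a 401 always carries the header."""
    if status == 401:
        # headers.get() returns None when the server never sent the header,
        # and concatenating None to a string raises TypeError.
        return "Auth required: " + headers.get("WWW-Authenticate")
    return "No auth required"


try:
    report_auth(401, {})   # spoofed 401 with no WWW-Authenticate header
    crashed = False
except TypeError:
    crashed = True
```

Python's None plays the role of Ruby's nil here: the tool falls over on a value it assumed would always be present.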
So no match, no shell. Okay? No cookie for you. Simple as that.
So quickly running through what we've talked about here. Okay. So we can use status codes
to our benefit. It's fun. It's useful. And we can slow people down with it. But browsers
can be quirky. So we need to do it in specific ways. And scanners and attack toolkits are
‑‑ you know, they're set in their ways. You know, this is the way we did it in 1990
and, God damn it, this is the way we're going to do it in 2013. You know? Get off my lawn.
It's just the way things are. Why change things if it's working? So my goal here is
to make it not work. WAFs need to seriously get more offensive about their defense because
they're being far too passive as far as I'm concerned. Okay? So just blocking a request,
providing a snazzy little ‑‑
I'm not talking hacking back. I don't want to start hacking people ‑‑ I don't want
to start hacking back people who attack my web service. But I want to be more active
in fighting back and saying to people that this is not right. If you're scanning my server,
I'm going to screw with you. And I'm going to screw with you until you cry. It's just
the way things go down. So slowing attackers down is good.
Making life harder for skiddies is absolutely priceless. I should have put the MasterCard
logo on that. So current tools are very much the same as
APT. Yeah, I said that. They are adequate and they do what they do. And until someone
fights back and says this is not good enough anymore, they're just going to keep doing
what they're doing. They're only as advanced as they need to be. Just like people attacking
you. If they can get you with a phishing attack,
then why would I bother wasting a zero day on you, simple as that.
I think this is the key to this entire talk, screwing with script kiddies is fun.
I've had this running on my web server for a while now and just checking the logs, it's
absolutely hilarious the amount of automated scans that hit your web server searching for
like TimThumb and how long it takes them to finish just a simple scan of your web server.
Just checking your logs, they spend days just scanning your web server for random stuff.
So how can people implement this?
There's no point in me talking about this stuff if we don't know how we can implement
this stuff.
So let's talk about the ghetto option.
We can implement it using PHP.
It's the lowest common denominator.
People probably wrote it in notepad but that's life.
So you can auto prepend a PHP file to say randomize the response code within a specific
section of response codes that are supported by PHP.
But again, we're limited by resources that are PHP handled.
So if your web server starts to error out because people are sending stuff that isn't
to be expected, then the web server is going to respond back.
So there's limited functionality there.
There's mitmdump, again, mitmproxy as I said at the beginning.
It's a real memory hog.
It will just use everything you've got.
So if you put mitmdump as a reverse proxy in front of your web server,
you can have some simple scripts that are just going to change the response codes.
That works.
It works.
But it's not the best solution.
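For a feel of what such a script might look like, here is a minimal sketch written against the addon interface of recent mitmproxy releases, which may differ from the API that existed when this talk was given. The naughty list, the class name, and the choice of codes are assumptions for illustration; this is not the code from the speaker's GitHub.

```python
import random

# Hypothetical naughty list; 203.0.113.7 is a documentation-range address.
NAUGHTY_IPS = {"203.0.113.7"}
CODES = (101, 102)


class StatusRewriter:
    def response(self, flow):
        """mitmproxy calls this hook for every response passing through the proxy."""
        client_ip = flow.client_conn.peername[0]
        if client_ip in NAUGHTY_IPS:
            # Keep the body and headers, lie only about the status code.
            flow.response.status_code = random.choice(CODES)


# mitmproxy picks up addons from this module-level list.
addons = [StatusRewriter()]
```

Assuming the backend listens on 127.0.0.1:8080 and the script is saved as rewrite_status.py (both made-up values), it could be loaded with something like `mitmdump --mode reverse:http://127.0.0.1:8080 -s rewrite_status.py`.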
So what's this enterprise approved solution?
Everyone knows NGINX, right?
It's like Apache but better.
It's a usable implementation.
NGINX is used quite often as a reverse proxy.
And if you use something like the NGINX Lua module, you can write some interesting scripts that
are going to change the response codes that are going out of NGINX.
So you simply load NGINX Lua.
And using ngx.status, you can specifically set stuff.
So the only problem is there's kind of a few bugs in the non-Git version.
Specific codes that are supported just tend to get returned as nil, which is kind of a
pain.
But if you use the version from Git, it shouldn't be so much of a problem.
But if you do an apt-get install and then install NGINX and install the optional extras,
you're going to run across a couple of problems.
So what does the future hold?
What's the next step?
Well, the next step, I've been trying to get this into ModSecurity to ease adoption by
implementing it into something that people are using already on a daily basis.
Okay?
Because no one wants to implement another layer of stuff.
Because the more you install, you're increasing your attack surface.
So you want to put something, change a couple of configuration files in ModSecurity and
just have it do this stuff without you having to think about it.
Okay?
But it's not simple.
It's not easy.
I've been discussing it with various people for about a year, and everyone's like, yeah,
that should be possible, maybe, kind of.
But I don't know how.
So this is kind of a long shot, but I need some help with this stuff.
Okay?
I'm not a C coder.
I'm not into writing Apache modules.
But if anyone is interested in this kind of stuff and they really want to implement it
into ModSecurity, then I would really appreciate the help.
Okay?
So how do we counter this research?
So we've told the scanners that they're crap.
We've told the scanners that they aren't doing stuff in the right way.
I really need a new microphone here.
So less reliance on status codes.
I know it's easy to say, but we're going to have to slow scanners down in order for
them to be more reliable.
Because right now they're just taking response codes and just ignoring everything else.
So start paying more attention to the actual content of the responses.
And some scanners are doing this already, but things need to be double checked.
So you get better matching if you do that.
Problem is, regex matching is kind of slow.
It's going to use more memory, it's going to take more time.
It's not easy.
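As a sketch of what "pay attention to the content" could mean for a scanner, the Python below decides from the response body first and ignores the status code entirely. The patterns and the function name are invented for illustration; a real scanner would need a much richer set of signatures, which is where the speed and memory cost comes from.

```python
import re

# Hypothetical "soft 404" signatures; real tools would need many more.
NOT_FOUND_PATTERNS = re.compile(r"not found|no such page|error 404", re.IGNORECASE)


def looks_like_real_page(status, body):
    """Judge a response by its content; the status code is deliberately ignored."""
    if NOT_FOUND_PATTERNS.search(body):
        return False           # an error page, whatever the status claims
    return bool(body.strip())  # non-empty content counts as a real page
```

A spoofed 404 that still carries the real page passes this check, while a 200 that carries an error page does not.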
But this is the cat and mouse game.
Every time we come up with something new or increase our security, then people who are
attacking websites or testing websites increase the productivity and increase the accuracy
of their tools.
So hopefully we can move this to the next level.
So that's all I've got.
Any questions?
So, yeah.
Have you tried to rewrite the text description to detect the specific scanners that the
kit has used?
Because right now you've got to detect them on browsers.
But if you detect them on the scanners, you can then know immediately that they're scanning
rather than using them.
So the question is, have I specifically looked at detecting specific scanners and how they
look when they're attacking websites?
Yes, I have looked at it, not as part of this research, as part of other research.
Yeah, it's interesting stuff.
You can also detect quite easily when Nikto is attacking your website.
The problem is, it's the same as this stuff.
As soon as you start detecting how specific scanners look when they hit your website,
the scanners are just going to start randomizing how they request stuff.
So it's another step in the kind of cat and mouse game.
So, yep.
Just a comment: if you have an F5, you can totally write an iRule
and do all the things you said you'd like ModSecurity to do.
Okay.
Yeah.
So F5, you can write scripts to specifically do this stuff.
So it's interesting.
So I'm sure everyone here has an F5.
So, yeah.
Do different versions of browsers respond in different ways?
So yeah, the question was, do different versions of browsers respond in different ways?
Yeah.
Yeah.
So when I was testing, I was checking all the different versions of IE.
And IE6 tends to do things in a kind of weird way.
You get some weird stuff, like with the 100 codes, it tries to download stuff, but
between specific versions, they don't tend to change the logic at all.
So I'm getting the wave.
So if anyone has any further questions or any further comments, the code is available
on my GitHub site.
And I'd just like to leave you with a parting thought:
whatever doesn't kill you makes you smaller.
Thank you.
